Deprecated Prizm Content Connect
PCCIS Caching Strategies

Why Does PCCIS Cache Files?

The power behind PCCIS's ability to deliver viewable web content quickly lies in its cache management. Viewing a multipage document requires that each page be converted into a web-compatible format such as JPEG, PNG, or ideally SVG (which gives the highest fidelity when scaled). The conversion is not instantaneous, so there is some delay before a page becomes viewable. Because PCCIS assumes a document will be viewed by more than one person over multiple sessions, it converts all the pages into web-viewable intermediate objects and stores them in its cache folders.

The conversion process begins with the first request to view a document page in a given viewing session. Typically, the viewable page data that is generated is then made available to any subsequent request for the same pages, reducing the time to view to only the time it takes to download the page data to the browser. In summary, cached files improve viewing performance because the viewing objects are pre-generated and stored in the cache folders.

The Cost of the PCCIS Cache

Cached files must be stored on some media device for some period of time. The files created for viewing can take up a considerable amount of space, so the growth of the cache needs to be controlled. PCCIS provides options for controlling both where the files are stored and how long they are kept. The cache consists of several folders with different purposes, and each can be relocated to a different device to spread the storage burden.

Optimizing Cache Performance

The majority of the PCCIS cache is made up of pre-generated document pages that are readily available on demand, so caching already improves performance when the same document is viewed repeatedly. There are three configurable cache folder locations, and placing certain ones on more responsive media can result in a better viewing experience with less burden on the server hosting the PCCIS service. Solid state drives (SSD) or Shared Memory minimize input/output (I/O) latency and access times for cached files, but these storage devices typically have much less capacity.

Cache Strategies and Tradeoffs

Several scenarios are presented below with proposed cache configuration solutions. You should be familiar with the PCCIS pcc.config file settings as outlined in PCCIS Configuration Options. In addition to the pcc.config file, there is a property in the JSON object that the application posts when requesting a new viewing session from PCCIS (see the PCCIS sample and the How To Adjust Caching Parameters for PCCIS topic).

With the default settings in the pcc.config file, viewing sessions time out after 20 minutes and cached files expire after one day. Also by default, the PCCIS cache folders are all created within the same parent directory on the root drive. These default settings give a reader 20 minutes to read a document once the viewing session is started. After that period, a new viewing session must be created for the reader to continue reading the document, either by refreshing the browser or through another mechanism you implement in your application.

The next time the same document is viewed, PCCIS simply delivers the viewing objects that were created in the first viewing session to the same reader, or to any other reader viewing the same document, for about 24 hours after the first viewing session was created. When a reader (same or new) requests the document a day later, the cache process starts over: PCCIS will have already deleted the cached pages and must regenerate all the viewable content of the document.

Scenario 1:

Viewing response appears slow even with caching enabled because many readers are viewing the document.

Solution:

Set the GroupStateFolder tag contents in the pcc.config file to a folder on a faster SSD device or, in Linux environments, to a folder on the Shared Memory device (that is, /dev/shm). The other cache folders noted in pcc.config, DocumentPath and TempcachePath, could also benefit from being placed on faster storage devices.

Example for Shared Memory device:

<GroupStateFolder>/dev/shm/Accusoft/Prizm/GroupState</GroupStateFolder>

<DocumentPath>/dev/shm/Accusoft/Prizm/DocumentCache</DocumentPath>

<TempcachePath>/dev/shm/Accusoft/Prizm/Cache</TempcachePath> 

The above settings in pcc.config set the cache directories to folders in Shared Memory in a Linux environment. Because Shared Memory is faster than standard disk drives, PCCIS typically responds more quickly, with less overall stress on the server to deliver viewing content.

Scenario 2:

Viewers are getting errors, and the storage device used for the PCCIS cache is reporting errors because it is full.

Solution:

Depending on available storage capacity of the selected device, the cache expiration period specified by CacheExpirationPeriod in pcc.config may need to be shortened to accommodate cache load. Please note that the time period for CacheExpirationPeriod should not be any shorter than the ViewingSessionTimeout time period. Otherwise, the ViewingSessionTimeout will take precedence and the cache expiration period will be forced to the same value. The ViewingSessionTimeout time period can be shortened but at the penalty of reducing the amount of time a user has to read a document in a single viewing session.  
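The precedence rule described above can be sketched in C#. This is only an illustration, not PCCIS code: the PccDurations class and both of its methods are hypothetical names, and the parser assumes that duration values use only the m, h, and d suffixes that appear in the examples in this topic.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of PCCIS.
static class PccDurations
{
    // Parses values such as "20m", "1h", or "1d".
    // Assumption: only the m/h/d suffixes shown in this topic are handled.
    public static TimeSpan Parse(string value)
    {
        string text = value.Trim();
        if (text.Length < 2)
            throw new FormatException("Expected a number followed by m, h, or d: " + value);

        double amount = double.Parse(
            text.Substring(0, text.Length - 1), CultureInfo.InvariantCulture);

        switch (char.ToLowerInvariant(text[text.Length - 1]))
        {
            case 'm': return TimeSpan.FromMinutes(amount);
            case 'h': return TimeSpan.FromHours(amount);
            case 'd': return TimeSpan.FromDays(amount);
            default: throw new FormatException("Unsupported unit in: " + value);
        }
    }

    // Models the rule above: a cache expiration period shorter than the
    // viewing session timeout is forced up to the timeout value.
    public static TimeSpan EffectiveCacheExpiration(
        TimeSpan cacheExpiration, TimeSpan sessionTimeout)
    {
        return cacheExpiration < sessionTimeout ? sessionTimeout : cacheExpiration;
    }
}
```

For example, with a CacheExpirationPeriod of 10m and a ViewingSessionTimeout of 20m, EffectiveCacheExpiration returns 20 minutes, which matches the forced value described above.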

Rather than changing the viewing session timeout period, try increasing the size of the (fast) storage device. If it is not practical to change the storage device size, try moving the TempcachePath folder to a different storage device and, if that isn’t enough, do the same for DocumentPath. Splitting the cache folders across dedicated storage devices can also benefit performance by reducing disk latency for Hard Disk Drives (HDD) compared to having one HDD serve all the viewing sessions.

Example for quicker cache cleanup: 

<CacheExpirationPeriod>20m</CacheExpirationPeriod>

<ViewingSessionTimeout>15m</ViewingSessionTimeout>

The above settings set the viewing session timeout to 15 minutes and the life expectancy of any cached file to 20 minutes. After approximately 35 to 45 minutes, the cached files for a given document will be deleted. The exact time of cleanup can vary based on the scheduled nature of the cleanup processes and current load on the server. 
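The 35-minute lower bound above is consistent with adding the viewing session timeout to the cache expiration period. The sketch below only illustrates that arithmetic. CacheCleanupEstimate and EarliestCleanup are hypothetical names, and the additive rule is an assumption inferred from the examples in this topic rather than a documented PCCIS formula.

```csharp
using System;

// Hypothetical helper, not part of PCCIS.
static class CacheCleanupEstimate
{
    // Assumed lower bound on when cached files for a document are deleted,
    // measured from the start of the viewing session.
    public static TimeSpan EarliestCleanup(
        TimeSpan sessionTimeout, TimeSpan cacheExpiration)
    {
        // An expiration shorter than the timeout is raised to the timeout,
        // as noted for CacheExpirationPeriod earlier in this scenario.
        TimeSpan effectiveExpiration =
            cacheExpiration < sessionTimeout ? sessionTimeout : cacheExpiration;

        return sessionTimeout + effectiveExpiration;
    }
}
```

For the settings above, EarliestCleanup(TimeSpan.FromMinutes(15), TimeSpan.FromMinutes(20)) returns 35 minutes. The scheduled cleanup processes and server load can push the actual deletion later, which this estimate does not model.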

Scenario 3: 

Your application displays a lot of large documents, and users cannot finish reading them before they get a viewing session timeout error.

Solution: 

The default setting in the pcc.config file for ViewingSessionTimeout is 20 minutes. It can be increased to a larger value, but PCCIS will then have more resources to track at any given moment, which could affect performance and host server capacity.

Example of longer viewing session duration: 

<ViewingSessionTimeout>1h</ViewingSessionTimeout>

<CacheExpirationPeriod>1d</CacheExpirationPeriod>    

The above settings give users an hour to read a given document in a single viewing session. Cache resources for the document will be removed 25 or more hours later. As above, the cleanup time varies based on the scheduled nature of the cleanup processes and the current load on the server.

Scenario 4:

The documents served are fairly random and not typically shared with others.

- Or -

The image is watermarked uniquely for each viewer and should not be shared.

Solution:

In this scenario, the cache resources are not likely to be needed by anyone except the initial user. A property in the JSON object that the application posts when requesting a new viewing session from PCCIS can be used to disable caching on a per-viewing-session basis. The property, serverCaching, should be set explicitly to the string value “none” when the application sends the POST request for a new viewing session ID. Each document uploaded to PCCIS is then converted without PCCIS looking for an existing copy of the document. After the viewing session times out, the cached items for the document are removed on a predetermined schedule, which should be fairly quick because no other viewing sessions are using the data.

Example:

Code snippet in application (see PCCIS sample for more details):

// Request a new viewing session from PCCIS.
//   POST http://localhost:18681/PCCIS/V1/ViewingSession
//
// Note: "serializer" is declared elsewhere in the sample and is assumed here
// to be a JSON serializer (for example, a JavaScriptSerializer).
string uriString = string.Format("{0}/ViewingSession", PccConfig.ImagingService);
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uriString);
request.Method = "POST";
using (StreamWriter requestStream = new StreamWriter(request.GetRequestStream(),
    Encoding.UTF8))
{
    ViewingSessionProperties viewingSessionProperties = new ViewingSessionProperties();

    // Store some information in PCCIS to be retrieved later.

    //
    //   Set serverCaching to "none" to force a different cache behavior
    //   where cache objects are always created but also more quickly removed.
    //
    viewingSessionProperties.serverCaching = "none";

    // Serialize the viewing session properties as JSON for the body of the request.
    string requestBody = serializer.Serialize(viewingSessionProperties);
    requestStream.Write(requestBody);
}

After the viewing session times out, the cached items should be removed fairly soon.
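The snippet above relies on a ViewingSessionProperties type and a serializer that are defined elsewhere in the PCCIS sample. As a rough, self-contained sketch of how the request body could be produced, the following assumes a minimal stand-in type with only the serverCaching member and assumes that the serializer is a .NET Framework JavaScriptSerializer. Both are assumptions made for illustration, and NoCacheRequestBody is a hypothetical name. The real sample type may carry additional members.

```csharp
using System.Web.Script.Serialization; // requires a reference to System.Web.Extensions

// Hypothetical minimal stand-in for the sample's ViewingSessionProperties type.
// Only the serverCaching member discussed in this topic is shown.
class ViewingSessionProperties
{
    public string serverCaching;
}

// Hypothetical helper, not part of the PCCIS sample.
static class NoCacheRequestBody
{
    // Builds the JSON text to post when requesting a viewing session
    // with caching disabled for that session.
    public static string Build()
    {
        ViewingSessionProperties properties = new ViewingSessionProperties();
        properties.serverCaching = "none";

        // Assumption: the sample's serializer behaves like JavaScriptSerializer.
        JavaScriptSerializer serializer = new JavaScriptSerializer();
        return serializer.Serialize(properties);
    }
}
```

The resulting body is a JSON object whose serverCaching property is the string "none", which is the value PCCIS needs in order to disable caching for that viewing session.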

Summary

The PCCIS cache provides a mechanism to deliver document content in a timely manner. However, each application is different and may tax server resources differently or have more demanding requirements. Balancing resource constraints against user experience can be a difficult task that may require compromises. Faster hardware, specifically high-speed storage devices, coupled with an understanding of the options for adjusting how the PCCIS cache behaves, should allow you to reach a desired level of performance while maintaining a good user experience.

 

 


©2015. Accusoft Corporation. All Rights Reserved.
